AITopics | negative curvature

Collaborating Authors

negative curvature

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

simple-saddle-camera-version

张辰逸

Neural Information Processing SystemsApr-25-2026, 17:14:13 GMT

Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm such that for a smooth function f: Rn!R, it outputs an -approximate second-order stationary point in O(logn/ 1.75)iterations. Compared to the previous state-of-the-art algorithms by Jin et al. with O(log4 n/ 2) or O(log6 n/ 1.75) iterations, our algorithm is polynomially better in terms of logn and matches their complexities in terms of 1/ .

algorithm, artificial intelligence, machine learning, (13 more...)

Neural Information Processing Systems

Country: Asia (0.15)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.57)

Add feedback

Adaptive Negative Curvature Descent with Applications in Non-convex Optimization

Neural Information Processing SystemsMar-17-2026, 02:37:13 GMT

Negative curvature descent (NCD) method has been utilized to design deterministic or stochastic algorithms for non-convex optimization aiming at finding second-order stationary points or local minima. In existing studies, NCD needs to approximate the smallest eigen-value of the Hessian matrix with a sufficient precision (e.g., $\epsilon_2\ll 1$) in order to achieve a sufficiently accurate second-order stationary solution (i.e., $\lambda_{\min}(\nabla^2 f(\x))\geq -\epsilon_2)$. One issue with this approach is that the target precision $\epsilon_2$ is usually set to be very small in order to find a high quality solution, which increases the complexity for computing a negative curvature. To address this issue, we propose an adaptive NCD to allow for an adaptive error dependent on the current gradient's magnitude in approximating the smallest eigen-value of the Hessian, and to encourage competition between a noisy NCD step and gradient descent step. We consider the applications of the proposed adaptive NCD for both deterministic and stochastic non-convex optimization, and demonstrate that it can help reduce the the overall complexity in computing the negative curvatures during the course of optimization without sacrificing the iteration complexity.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Adaptive Negative Curvature Descent with Applications in Non-convex Optimization

Mingrui Liu, Zhe Li, Xiaoyu Wang, Jinfeng Yi, Tianbao Yang

Neural Information Processing SystemsFeb-15-2026, 06:44:20 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, complexity, time complexity, (13 more...)

Neural Information Processing Systems

Country:

North America > United States > Iowa > Johnson County > Iowa City (0.14)
North America > Canada > Quebec > Montreal (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

Add feedback

Zeroth-OrderNegativeCurvatureFinding: Escaping SaddlePointswithoutGradients

Neural Information Processing SystemsFeb-12-2026, 23:32:16 GMT

Several classical results have shown that, forρ-Hessian Lipschitz functions (see Definition 1), using the second-order information like computing the Hessian [33] or Hessian-vector products [1, 9, 2], one can find anϵ-approximate second-order stationary point (SOSP, f(x) ϵ and 2f(x) ρϵI).

algorithm, artificial intelligence, machine learning, (16 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.31)

Add feedback

First-order Stochastic Algorithms for Escaping From Saddle Points in Almost Linear Time

Yi Xu, Rong Jin, Tianbao Yang

Neural Information Processing SystemsFeb-12-2026, 10:17:16 GMT

For finding a nearly second-orderstationary pointxsuchthatk F(x)k and 2F(x) I (in high probability), the best time complexity of the presented algorithms is eO(d/3.5),whereF(

algorithm, artificial intelligence, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Asia > Middle East > Jordan (0.05)
North America > United States (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.30)

Add feedback

Appendices

Neural Information Processing SystemsFeb-12-2026, 00:13:15 GMT

The Hessian of f(Z) can be viewed as an KN KN matrix by vectorizing the matrix Z. For deeper linear networks, it can be shown that flat saddle points exist at the origin, but there are no spurious local minima [34,37]. While most of these results based on the bottom-up approach explain optimization and generalization of certain types of deep neural networks, they provided limited insights into the practice of deep learning. In fact, our proof techniques are inspired by recent results on low-rank matrix recovery [77,80]. Some of the metrics are similar to those presented in [1]. Figure 7 depicts the learning curves in terms of both the training and test accuracy for all three optimization algorithms (i.e., SGD, Adam, and LBFGS).

artificial intelligence, deep learning, machine learning, (19 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback